Improving Procedural Skill Explanations via Constrained Generation: A Symbolic-LLM Hybrid Architecture

Dass, Rahul, Bowlin, Thomas, Li, Zebing, Jin, Xiao, Goel, Ashok

arXiv.org Artificial Intelligence

In procedural skill learning, instructional explanations must convey not just steps, but the causal, goal-directed, and compositional logic behind them. Large language models (LLMs) often produce fluent yet shallow responses that miss this structure. We present Ivy, an AI coaching system that delivers structured, multi-step explanations by combining symbolic Task-Method-Knowledge (TMK) models with a generative interpretation layer: an LLM that constructs explanations while being constrained by TMK structure. TMK encodes causal transitions, goal hierarchies, and problem decompositions, and guides the LLM within explicit structural bounds. We evaluate Ivy's responses against GPT and retrieval-augmented GPT baselines using expert and independent annotations across three inferential dimensions. Results show that symbolic constraints consistently improve the structural quality of explanations for "how" and "why" questions. This study demonstrates a scalable AI-for-education approach that strengthens the pedagogical value of AI-generated explanations in intelligent coaching systems.


Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer

Khan, Muhammad Tayyab, Yong, Zane, Chen, Lequn, Tan, Jun Ming, Feng, Wenhe, Moon, Seung Ki

arXiv.org Artificial Intelligence

Accurate extraction of key information from 2D engineering drawings is crucial for high-precision manufacturing. Manual extraction is slow and labor-intensive, while traditional Optical Character Recognition (OCR) techniques often struggle with complex layouts and overlapping symbols, resulting in unstructured outputs. To address these challenges, this paper proposes a novel hybrid deep learning framework for structured information extraction by integrating an Oriented Bounding Box (OBB) detection model with a transformer-based document parsing model (Donut). An in-house annotated dataset is used to train YOLOv11 to detect nine key categories: Geometric Dimensioning and Tolerancing (GD&T), General Tolerances, Measures, Materials, Notes, Radii, Surface Roughness, Threads, and Title Blocks. Detected OBBs are cropped into images and labeled to fine-tune Donut for structured JSON output. Fine-tuning strategies include a single model trained across all categories and category-specific models. Results show that the single model consistently outperforms the category-specific ones across all evaluation metrics, achieving higher precision (94.77% for GD&T), recall (100% for most categories), and F1 score (97.3%), while reducing hallucinations (5.23%). The proposed framework improves accuracy, reduces manual effort, and supports scalable deployment in precision-driven industries.
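The detect-crop-parse pipeline described above can be sketched in a few lines. This is a minimal illustration only: the helper names are hypothetical, the crops use axis-aligned slices rather than true oriented boxes, and `parse_with_donut` is a stand-in for the fine-tuned Donut model.

```python
import numpy as np

# The nine categories the abstract says YOLOv11 is trained to detect.
CATEGORIES = ["GD&T", "General Tolerances", "Measures", "Materials", "Notes",
              "Radii", "Surface Roughness", "Threads", "Title Blocks"]

def crop_regions(image, detections):
    """Crop each detected region (axis-aligned simplification of the OBBs)."""
    return [(category, image[y0:y1, x0:x1])
            for category, (x0, y0, x1, y1) in detections]

def parse_with_donut(category, crop):
    """Placeholder: a real system would run the fine-tuned Donut model here
    and return its structured JSON output for the cropped region."""
    return {"category": category, "text": f"<parsed {crop.shape}>"}

def extract_structured_info(image, detections):
    """Detection output -> cropped regions -> structured records."""
    return [parse_with_donut(c, crop) for c, crop in crop_regions(image, detections)]

# Tiny synthetic "drawing" and two mock detections (x0, y0, x1, y1).
drawing = np.zeros((200, 300), dtype=np.uint8)
dets = [("GD&T", (10, 10, 60, 40)), ("Title Blocks", (200, 150, 290, 190))]
records = extract_structured_info(drawing, dets)
```

In the actual framework the single Donut model fine-tuned across all categories would replace the placeholder parser.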


MAGIC: Near-Optimal Data Attribution for Deep Learning

Ilyas, Andrew, Engstrom, Logan

arXiv.org Machine Learning

A fundamental problem when building machine learning systems is to predict counterfactuals about model behavior. For example, scaling laws [KMH+20; Has21; MRB+23] aim to predict the performance of systems trained with more data and more compute than is currently available; interpretability techniques [KWG+18] predict how models behave under counterfactual inputs. Analogously, in this work we study predictive data attribution (or datamodeling [IPE+22]), where the goal is to predict how a model would behave if it had been trained on a different dataset. This well-studied problem encompasses, e.g., estimating the effect (on the resulting trained model's predictions) of modifying a training example [KL17], removing a group of training examples [KAT+19; BNL+22; PGI+23], or adding entire training data sources [LSZ+24]. Predictive data attribution in large-scale settings is challenging: it requires simulating training a model on a different dataset without actually training [GWP+23; IGE+24]. In "classical" settings--when learning corresponds to minimizing a convex loss--statistical tools like the influence function [Ham47] allow us to accurately and efficiently estimate how different training data choices change trained model predictions [RM18; KAT+19; GSL+19]. However, in the non-convex settings that are ubiquitous in natural domains like language/vision, current methods are less effective. Indeed, the best existing methods produce estimates that typically (a) only moderately correlate with the ground truth [BPF21; BNL+22; PGI+23] and (b) incur large absolute error [BNL+22].
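The "classical" convex setting mentioned above can be made concrete with ridge regression, where both the influence-function estimate and the exact leave-one-out solution have closed forms. This is a self-contained sketch on synthetic data, not the paper's method; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 3, 1e-2
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

H = X.T @ X + lam * np.eye(d)         # Hessian of the ridge objective
theta = np.linalg.solve(H, X.T @ y)   # full-data minimizer

i = 0                                  # training example to "remove"
g_i = X[i] * (X[i] @ theta - y[i])     # gradient of example i's loss at theta

# First-order influence-function estimate of the leave-one-out parameters.
theta_if = theta + np.linalg.solve(H, g_i)

# Exact leave-one-out solution: retrain (in closed form) without example i.
H_loo = H - np.outer(X[i], X[i])
theta_loo = np.linalg.solve(H_loo, X.T @ y - X[i] * y[i])

err_if = np.linalg.norm(theta_if - theta_loo)     # influence-estimate error
err_none = np.linalg.norm(theta - theta_loo)      # baseline: ignore removal
```

For convex losses the first-order estimate tracks the retrained parameters closely; as the abstract notes, it is precisely in non-convex settings that such estimates degrade.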


FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting

Li, Zhe, Qiu, Xiangfei, Chen, Peng, Wang, Yihang, Cheng, Hanyin, Shu, Yang, Hu, Jilin, Guo, Chenjuan, Zhou, Aoying, Wen, Qingsong, Jensen, Christian S., Yang, Bin

arXiv.org Artificial Intelligence

Time Series Forecasting (TSF) is a key functionality in numerous fields, including finance, weather services, and energy management. While many TSF methods have emerged recently, they often require domain-specific data collection and model training and generalize poorly to new domains. Foundation models aim to overcome this limitation. Pre-trained on large-scale language or time series data, they exhibit promising inference capabilities on new or unseen data. This has spurred a surge in new TSF foundation models. We propose a new benchmark, FoundTS, to enable thorough and fair evaluation and comparison of such models. FoundTS covers a variety of TSF foundation models, including those based on large language models and those pretrained on time series. Next, FoundTS supports different forecasting strategies, including zero-shot, few-shot, and full-shot, thereby facilitating more thorough evaluations. Finally, FoundTS offers a pipeline that standardizes evaluation processes such as dataset splitting, loading, normalization, and few-shot sampling, thereby facilitating fair evaluations. Building on this, we report on an extensive evaluation of TSF foundation models on a broad range of datasets from diverse domains and with different statistical characteristics. Specifically, we identify the pros, cons, and inherent limitations of existing foundation models, and we identify directions for future model design. We make our code and datasets available at https://anonymous.4open.science/r/FoundTS-C2B0.
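The standardized evaluation steps the abstract lists (splitting, normalization, few-shot sampling) might look like the following. This is a hypothetical sketch of what such a pipeline could do, not FoundTS's actual code; the function name and defaults are assumptions.

```python
import numpy as np

def make_eval_splits(series, horizon, few_shot_frac=0.05, seed=0):
    """Split a univariate series, normalize with train statistics only,
    and draw a reproducible few-shot sample from the training segment."""
    train, test = series[:-horizon], series[-horizon:]
    mean, std = train.mean(), train.std() + 1e-8   # avoid divide-by-zero
    norm = lambda x: (x - mean) / std              # test uses TRAIN stats
    rng = np.random.default_rng(seed)
    k = max(1, int(len(train) * few_shot_frac))
    idx = np.sort(rng.choice(len(train), size=k, replace=False))
    return {"train": norm(train), "test": norm(test), "few_shot": norm(train[idx])}

splits = make_eval_splits(np.arange(100.0), horizon=10)
```

Fixing the split point, normalization statistics, and few-shot seed across all models is what makes comparisons between foundation models fair.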


Aligning Models with Their Realization through Model-based Systems Engineering

Zenz, Lovis Justin Immanuel, Heiland, Erik, Hillmann, Peter, Karcher, Andreas

arXiv.org Artificial Intelligence

In this paper, we propose a method for aligning models with their realization through the application of model-based systems engineering. Our approach is divided into three steps. (1) Firstly, we leverage domain expertise and the Unified Architecture Framework to establish a reference model that fundamentally describes a domain. (2) Subsequently, we instantiate the reference model as specific models tailored to different scenarios within the domain. (3) Finally, we incorporate corresponding run logic directly into both the reference model and the specific models. In total, we thus provide a practical means to ensure that every implementation result is justified by business demand. We demonstrate our approach using the example of maritime object detection as a specific application (specific model / implementation element) of automatic target recognition as a service recurring in various forms (reference model element). Our approach facilitates a more seamless integration of models and implementation, fostering enhanced Business-IT alignment.


Ghostbuster: detecting text ghostwritten by large language models

AIHub

Large language models like ChatGPT write impressively well--so well, in fact, that they've become a problem. Students have begun using these models to ghostwrite assignments, leading some schools to ban ChatGPT. In addition, these models are also prone to producing text with factual errors, so wary readers may want to know if generative AI tools have been used to ghostwrite news articles or other sources before trusting them. What can teachers and consumers do? Existing tools to detect AI-generated text sometimes do poorly on data that differs from what they were trained on.


Safer Together: Machine Learning Models Trained on Shared Accident Datasets Predict Construction Injuries Better than Company-Specific Models

Tixier, Antoine J. -P., Hallowell, Matthew R.

arXiv.org Artificial Intelligence

In this study, we capitalized on a collective dataset repository of 57k accidents from 9 companies belonging to 3 domains and tested whether models trained on multiple datasets (generic models) predicted safety outcomes better than company-specific models. We experimented with full generic models (trained on all data), per-domain generic models (construction, electric T&D, oil & gas), and with ensembles of generic and specific models. Results are very positive, with generic models outperforming the company-specific models in most cases while also generating finer-grained, hence more useful, forecasts. Successful generic models remove the need to train company-specific models, saving considerable time and resources, and give small companies, whose accident datasets are too limited to train their own models, access to safety outcome predictions. It may still, however, be advantageous to train specific models to get an extra boost in performance through ensembling with the generic models. Overall, by learning lessons from a pool of datasets whose accumulated experience far exceeds that of any single company, and making these lessons easily accessible in the form of simple forecasts, generic models tackle the holy grail of cross-organizational safety learning and dissemination in the construction industry.
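The ensembling step mentioned above can be as simple as a weighted average of the two models' predicted outcome probabilities. The sketch below is a generic illustration of that idea, with a hypothetical function name and weights; in practice the weight would be tuned on validation data.

```python
def ensemble_forecast(p_generic, p_specific, w_generic=0.5):
    """Combine a generic (multi-company) model's predicted probabilities
    with a company-specific model's via a weighted average."""
    return [w_generic * pg + (1.0 - w_generic) * ps
            for pg, ps in zip(p_generic, p_specific)]

# Example: three accident-outcome probabilities from each model.
combined = ensemble_forecast([0.8, 0.1, 0.4], [0.6, 0.3, 0.4], w_generic=0.7)
```

Leaning on the generic model (here with weight 0.7) reflects the study's finding that the pooled data carries most of the signal, while the specific model contributes a company-level correction.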


A Journey into the Fabulous Applications of Transformers -- Part 2 – Towards AI

#artificialintelligence

Originally published on Towards AI, the world's leading AI and technology news and media company. Transformer architecture is widely used in Natural Language Processing and has contributed substantially to the rise of Large Language Models (LLMs).


Federated Learning based Energy Demand Prediction with Clustered Aggregation

Tun, Ye Lin, Thar, Kyi, Thwal, Chu Myaet, Hong, Choong Seon

arXiv.org Artificial Intelligence

To reduce negative environmental impacts, power stations and energy grids need to optimize the resources required for power production. Thus, predicting the energy consumption of clients is becoming an important part of every energy management system. Energy usage information collected by the clients' smart homes can be used to train a deep neural network to predict future energy demand. Collecting data from a large number of distributed clients for centralized model training is expensive in terms of communication resources. To take advantage of distributed data in edge systems, centralized training can be replaced by federated learning, where each client only needs to upload model updates produced by training on its local data. These model updates are aggregated into a single global model by the server. However, since different clients can have different attributes, their model updates can have diverse weights, and as a result, the aggregated global model can take a long time to converge. To speed up convergence, we can apply clustering to group clients based on their properties and aggregate model updates from the same cluster together to produce a cluster-specific global model. In this paper, we propose a recurrent neural network based energy demand predictor, trained with federated learning on clustered clients to take advantage of distributed data and speed up the convergence process.
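The clustered aggregation step described above can be sketched as weighted federated averaging performed per cluster. This is a minimal numpy illustration with flat weight vectors and hypothetical client IDs; a real system would aggregate RNN weight tensors over many communication rounds.

```python
import numpy as np

def clustered_fedavg(client_weights, client_sizes, client_cluster):
    """Aggregate client model updates per cluster (FedAvg weighted by
    each client's local sample count), returning one model per cluster."""
    clusters = {}
    for cid, w in client_weights.items():
        clusters.setdefault(client_cluster[cid], []).append(
            (np.asarray(w, dtype=float), client_sizes[cid]))
    global_models = {}
    for cluster_id, members in clusters.items():
        total = sum(n for _, n in members)
        global_models[cluster_id] = sum(w * (n / total) for w, n in members)
    return global_models

# Three clients: "a" and "b" share cluster 0, "c" is alone in cluster 1.
models = clustered_fedavg(
    client_weights={"a": [1.0, 1.0], "b": [3.0, 3.0], "c": [10.0, 0.0]},
    client_sizes={"a": 1, "b": 3, "c": 2},
    client_cluster={"a": 0, "b": 0, "c": 1},
)
```

Grouping similar clients before averaging keeps each cluster model close to its members' local optima, which is the convergence speed-up the paper targets.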


Identification of 12 cancer types through genome deep learning - Scientific Reports

#artificialintelligence

Cancer is a major cause of death worldwide, and an early diagnosis is required for a favorable prognosis. Histological examination is the gold standard for cancer identification; however, a large amount of inter-observer variability exists in histological diagnosis. Numerous studies have shown that cancer genesis is accompanied by an accumulation of harmful mutations, potentiating the identification of cancer based on genomic information. We have proposed a method, GDL (genome deep learning), to study the relationship between genomic variations and traits based on deep neural networks. We analyzed 6,083 samples' WES (Whole Exome Sequencing) mutation files from 12 cancer types obtained from the TCGA (The Cancer Genome Atlas) and 1,991 healthy samples' WES data from the 1000 Genomes project. We constructed 12 specific models to distinguish between a certain type of cancer and healthy tissue, a total-specific model that can identify healthy and cancer tissues, and a mixture model to distinguish between all 12 types of cancer based on GDL. We demonstrate that the accuracies of the specific, mixture, and total-specific models are 97.47%, 70.08%, and 94.70%, respectively, for cancer identification. We developed an efficient method for the identification of cancer based on genomic information that offers a new direction for disease diagnosis.